January 23, 2005
Probabilistic Model of Range
Update: I have improved data for the models, so I've updated the table in a new post. The order changes a little, but not enough to make a big difference.
Sorry, I hit the save instead of the preview button for this post. An explanation will be added shortly.
2004 Probabilistic Model of Range, Totals for Teams
Team | InPlay | Actual Outs | Predicted Outs | DER | Predicted DER | Difference |
Angels | 4360 | 2990 | 3080.32 | 0.686 | 0.706 | -0.02072 |
Royals | 4643 | 3127 | 3211.12 | 0.673 | 0.692 | -0.01812 |
Yankees | 4488 | 3081 | 3158.71 | 0.686 | 0.704 | -0.01732 |
Tigers | 4524 | 3091 | 3169.20 | 0.683 | 0.701 | -0.01729 |
Orioles | 4458 | 3058 | 3125.39 | 0.686 | 0.701 | -0.01512 |
Pirates | 4326 | 2959 | 3023.71 | 0.684 | 0.699 | -0.01496 |
Reds | 4590 | 3155 | 3220.04 | 0.687 | 0.702 | -0.01417 |
Twins | 4491 | 3083 | 3140.70 | 0.686 | 0.699 | -0.01285 |
Mariners | 4490 | 3140 | 3183.09 | 0.699 | 0.709 | -0.00960 |
Brewers | 4416 | 3049 | 3086.30 | 0.690 | 0.699 | -0.00845 |
Rockies | 4620 | 3138 | 3174.15 | 0.679 | 0.687 | -0.00782 |
Expos | 4421 | 3067 | 3100.04 | 0.694 | 0.701 | -0.00747 |
Astros | 4151 | 2843 | 2866.27 | 0.685 | 0.691 | -0.00561 |
Indians | 4490 | 3069 | 3093.60 | 0.684 | 0.689 | -0.00548 |
Rangers | 4551 | 3124 | 3148.34 | 0.686 | 0.692 | -0.00535 |
Athletics | 4499 | 3127 | 3148.70 | 0.695 | 0.700 | -0.00482 |
Diamondbacks | 4320 | 2939 | 2955.30 | 0.680 | 0.684 | -0.00377 |
Braves | 4489 | 3088 | 3102.32 | 0.688 | 0.691 | -0.00319 |
Blue Jays | 4478 | 3097 | 3108.56 | 0.692 | 0.694 | -0.00258 |
Padres | 4393 | 3040 | 3050.63 | 0.692 | 0.694 | -0.00242 |
Giants | 4541 | 3148 | 3157.22 | 0.693 | 0.695 | -0.00203 |
Devil Rays | 4471 | 3127 | 3135.05 | 0.699 | 0.701 | -0.00180 |
Marlins | 4263 | 2991 | 2995.97 | 0.702 | 0.703 | -0.00117 |
Mets | 4557 | 3166 | 3170.73 | 0.695 | 0.696 | -0.00104 |
Phillies | 4452 | 3127 | 3129.24 | 0.702 | 0.703 | -0.00050 |
Dodgers | 4333 | 3089 | 3089.39 | 0.713 | 0.713 | -0.00009 |
White Sox | 4375 | 3038 | 3028.95 | 0.694 | 0.692 | 0.00207 |
Cubs | 4124 | 2873 | 2861.76 | 0.697 | 0.694 | 0.00273 |
Red Sox | 4391 | 3041 | 3028.85 | 0.693 | 0.690 | 0.00277 |
Cardinals | 4387 | 3112 | 3097.10 | 0.709 | 0.706 | 0.00340 |
Explanation: Last year, I worked on a way of measuring range which I called a Probabilistic Model of Range (see the defense archives). I was basically repeating work done by Mitchel Lichtman, which he named the Ultimate Zone Rating (UZR). Since Mitchel's work was more mature than mine, and since I had to write new software because the source of my data changed, I did not pursue these rankings for the 2004 season. However, I just learned that Mr. Lichtman is working for the Cardinals (congratulations, Mike!) and won't be publishing his results anymore. There's a niche to fill, so here it goes.
I calculate the probability of a ball being turned into an out based on six parameters:
- Direction of hit (a vector).
- The type of hit (Fly, ground, line drive, bunt).
- How hard the ball was hit (slow, medium, hard).
- The park.
- The handedness of the pitcher.
- The handedness of the batter.
The program sums, over every ball in play, the probability of that ball being turned into an out, and that gives us the expected outs. Dividing that by balls in play yields the expected defensive efficiency rating (DER). That is compared to the team's actual DER. A good defensive team should have a better DER than its expected DER.
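As a rough illustration of the calculation just described, here is a minimal Python sketch. It assumes the model is a simple lookup table of out rates, one per combination of the six parameters; the post does not say how the probabilities are actually stored or smoothed. The function names (`build_out_rates`, `team_der`) and the toy records are made up for the example.

```python
from collections import defaultdict

def build_out_rates(history):
    """Estimate P(out) for each combination of ball-in-play parameters.

    history: iterable of (key, was_out) pairs, where key is a tuple such as
    (direction, hit_type, hardness, park, pitcher_hand, batter_hand).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [outs, balls in play]
    for key, was_out in history:
        counts[key][0] += int(was_out)
        counts[key][1] += 1
    return {key: outs / balls for key, (outs, balls) in counts.items()}

def team_der(team_balls, out_rates):
    """Return (actual DER, expected DER) for one team's balls in play."""
    n = len(team_balls)
    actual_outs = sum(int(was_out) for _, was_out in team_balls)
    # Expected outs: sum of the out probability of every ball in play.
    expected_outs = sum(out_rates[key] for key, _ in team_balls)
    return actual_outs / n, expected_outs / n

# Toy data, invented for illustration (not real BIS records).
grounder = ("vector_12", "ground", "medium", "park_X", "R", "L")
liner = ("vector_30", "line drive", "hard", "park_X", "R", "R")
history = [(grounder, True)] * 3 + [(grounder, False)] + [(liner, True), (liner, False)]
rates = build_out_rates(history)

team_balls = [(grounder, True), (grounder, True), (liner, False), (liner, True)]
actual, expected = team_der(team_balls, rates)
print(f"DER {actual:.3f}, expected DER {expected:.3f}, difference {actual - expected:+.3f}")
```

In this toy case the team converts more balls than the table of rates predicts, so its difference is positive, which is how the last column of the table above should be read.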
There are differences between this year's and last year's calculation. I'm now using three years of data instead of just one. Also, Baseball Info Solutions charts balls differently than STATS, Inc., so there are many more vectors than in the previous system. I believe that actually improves the calculation. Finally, the numbers above are approximate; my database is from early October, and BIS had not input every ball in play yet. Still, it should be enough to get a feel for how good teams were on defense in 2004.
The first thing to notice from the table is that it was a poor defensive season overall. Only four teams had a better DER than predicted by the model. The Cardinals and Red Sox were 1-2, and ended up in the World Series. The Angels were last, but also made the playoffs. The Yankees continued their abysmal defense, while the Mets' high ranking should help explain why so many of their pitchers had better ERAs than DIPS ERAs.
The next step is to use this method to evaluate individual fielders. Watch for that in upcoming posts.
Update: Just in case I wasn't clear on this, the model is built on three years of data, but the chart above is just for 2004.
Correction: Corrected the spelling of Mitchel Lichtman's name.
The Yankees probably "continue their dismal defense" because you used 3 years of stats instead of 2. We all know how bad their defense was in 2002 and 2003. I'd like to see the stats for just 2004 so we can compare them to 2003's numbers and see which teams are heading in the right (and wrong) direction.
That being said: great work, David.
Outstanding, David. Thank you for running this again. I'm surprised the Angels come out so poorly. Maybe that explains their push for Finley and Cabrera. It will be interesting to see if your system ranks Finley as poorly as UZR does.
I know we discussed this last year, but I forget the answer: is this data adjusted for ballpark?
The three years' worth of data is used to calculate the baseline against which last year's results are compared.
For those of us interested in tracking stats and info but unable to wrap their brains around the math and the various acronyms, is there a way to turn the results into something simpler?
Thanks for running your system. I'm confused about the team out totals. It seems that if you add all the teams up, they collectively fielded close to 900 fewer balls than they should have.
Shouldn't this be centered by league average? Or maybe it's centered by the *three*-year average, which would imply that fielding was below average in 2004 across the major leagues.
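The league-wide shortfall raised here can be checked by adding up the columns of the table in the post. The short Python sketch below does only that; the rows are copied from the table, and the variable names are just for the example.

```python
# (team, balls in play, actual outs, predicted outs), copied from the table in the post.
TEAMS = [
    ("Angels", 4360, 2990, 3080.32),
    ("Royals", 4643, 3127, 3211.12),
    ("Yankees", 4488, 3081, 3158.71),
    ("Tigers", 4524, 3091, 3169.20),
    ("Orioles", 4458, 3058, 3125.39),
    ("Pirates", 4326, 2959, 3023.71),
    ("Reds", 4590, 3155, 3220.04),
    ("Twins", 4491, 3083, 3140.70),
    ("Mariners", 4490, 3140, 3183.09),
    ("Brewers", 4416, 3049, 3086.30),
    ("Rockies", 4620, 3138, 3174.15),
    ("Expos", 4421, 3067, 3100.04),
    ("Astros", 4151, 2843, 2866.27),
    ("Indians", 4490, 3069, 3093.60),
    ("Rangers", 4551, 3124, 3148.34),
    ("Athletics", 4499, 3127, 3148.70),
    ("Diamondbacks", 4320, 2939, 2955.30),
    ("Braves", 4489, 3088, 3102.32),
    ("Blue Jays", 4478, 3097, 3108.56),
    ("Padres", 4393, 3040, 3050.63),
    ("Giants", 4541, 3148, 3157.22),
    ("Devil Rays", 4471, 3127, 3135.05),
    ("Marlins", 4263, 2991, 2995.97),
    ("Mets", 4557, 3166, 3170.73),
    ("Phillies", 4452, 3127, 3129.24),
    ("Dodgers", 4333, 3089, 3089.39),
    ("White Sox", 4375, 3038, 3028.95),
    ("Cubs", 4124, 2873, 2861.76),
    ("Red Sox", 4391, 3041, 3028.85),
    ("Cardinals", 4387, 3112, 3097.10),
]

total_in_play = sum(row[1] for row in TEAMS)
total_actual = sum(row[2] for row in TEAMS)
total_predicted = sum(row[3] for row in TEAMS)

print("balls in play:", total_in_play)
print("outs short of prediction:", round(total_predicted - total_actual, 2))
print("league DER:", round(total_actual / total_in_play, 4))
print("league predicted DER:", round(total_predicted / total_in_play, 4))
```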
Can't wait to see the individual results, David. A lot of people have been clamoring for UZR, and since MGL said he won't publish them anymore, I was wondering who would step up to the plate.
>"I'm surprised the Angels come out so poorly. Maybe that explains their push for Finley and Cabrera."
I wouldn't read that far into what the Angels are doing. If they really knew what was going on with their defense, they wouldn't have moved Erstad to first base, and they most definitely wouldn't have kept him there for next season...
Cabrera shouldn't have been signed for more than two years, as a stopgap. His defense could be described (at best) as "slightly above average." And that's being generous. I think most people, after looking at the numbers, would describe him as average-to-below. Even if his defensive skills were to magically return, they wouldn't make up for his lack of offense.
Finley's defense is something of a question mark. He's definitely not GG-caliber anymore, but he put up a decent fielding season last year. Something to keep in mind is that he isn't getting any younger. I'm betting the Angels signed him more for his offense, and because they needed to replace Guillen.
Interesting that the Red Sox came out so high. I remember Theo justifying getting Cabrera and Doug M. (and losing Nomar) to shore up the poor defense.
This is more of a question for individual results, when they arrive.
Does BIS chart defensive player placement as well as the zone of the ball? I'm curious if it's possible to pull out the effect of defensive shifts.
David,
Does the "vector" include distance (which would be good), or is it simply the slice of the field (which would be bad)?
Great work David. I know that the individual results are coming soon, but are you planning on showing additional cuts of the data by team?
For example, which teams are best at fielding ground balls (or at least exceed expectations the most) or which are worst at flyballs.
Using David's results (133,092 balls in play), one standard deviation would equal .0013.
The .6911 actual DER and .6976 expected DER show a difference of .0065. Comparing the 2004 DER to the 2002-2003 DER (my estimate for the 02-03 DER is .7008), the difference between the 04 sample and the 02-03 sample is a whopping .0097.
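For anyone who wants to reproduce the .0013 figure: it is what you get if each ball in play is treated as an independent yes/no trial with the league out rate, so that the standard deviation of the observed rate is sqrt(p(1-p)/n). That binomial assumption is mine; the comment does not say how the number was derived. A small Python sketch:

```python
import math

def der_standard_error(der, balls_in_play):
    """Binomial standard error of an observed DER (assumes independent balls in play)."""
    return math.sqrt(der * (1.0 - der) / balls_in_play)

# League figures quoted in the comment: .6911 actual DER on 133,092 balls in play.
se = der_standard_error(0.6911, 133092)
gap = 0.6976 - 0.6911  # expected DER minus actual DER

print("one standard deviation:", round(se, 4))
print("gap in standard deviations:", round(gap / se, 1))
```

Under that assumption the .0065 gap is roughly five standard deviations, which is why it is hard to write off as chance.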
Since you are talking about the same (say 80%) fielders, pitchers, hitters, parks, then external factors might have had an influence (umpires, weather, balls, bats, etc).
I think it's much more appropriate to zero out the results, such that the league difference for each year is exactly zero. It's much more representative of the stable nature of the players.
Unless you can isolate more parameters, applying a league fudge factor would make the results better.
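One simple way to apply such a league adjustment is to subtract the league-wide gap from every team's difference. The Python sketch below assumes that additive form (scaling predicted outs by a ratio would be another option) and uses the league DERs quoted earlier in this thread; `center_on_league` is a made-up name.

```python
def center_on_league(team_diffs, league_actual_der, league_expected_der):
    """Subtract the league-wide DER gap from each team's difference."""
    league_gap = league_actual_der - league_expected_der
    return {team: diff - league_gap for team, diff in team_diffs.items()}

# Differences taken from the table in the post (three teams shown for brevity).
diffs = {"Angels": -0.02072, "Dodgers": -0.00009, "Cardinals": 0.00340}
centered = center_on_league(diffs, 0.6911, 0.6976)

for team, value in centered.items():
    print(f"{team}: {value:+.4f}")
```

After centering, a team like the Dodgers that sat almost exactly on its prediction comes out above the league, since the league as a whole fell short.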
Oh, and great work! I love this stuff!
I apologize in advance for all the posts.
The team standard deviation of the DER difference is .0070. What does that tell us? Plenty!
Assume that you've got 7 fielders, all of whom have 4 balls in play hit to them per game, and they were selected randomly to be on the team. Question: what would be the spread in talent level of such a model, so that the .0070 team standard deviation is met (and that the .0070 is representative)?
Answer: .0185 (with my quick calculations anyway).
What does that tell you? It tells you that 68% of all MLB players will get to +/- .0185 outs per ball in play. With 600 balls in play per player per 162 games (more or less), that's 11 outs per season. Which works out to 9 runs per season.
That is, 68% of all players perform at +/- 9 runs from the mean. 95% are within +/- 18 runs.
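The calculation behind the .0185 is not shown, but one way to land on it is to treat the team rate as the average of 7 independent, equally weighted fielder rates, so that the fielder spread is the team spread times sqrt(7). That reading is an assumption, and it ignores random noise in the team figure. A Python sketch under that assumption:

```python
import math

def player_talent_sd(team_sd, fielders=7):
    """Per-fielder SD implied by a team SD, if the team rate is the mean of
    `fielders` independent, equally weighted fielder rates."""
    return team_sd * math.sqrt(fielders)

def outs_per_season(talent_sd, balls_in_play=600):
    """Outs per season represented by one SD of talent."""
    return talent_sd * balls_in_play

sd = player_talent_sd(0.0070)
season_outs = outs_per_season(sd)

print("fielder SD per ball in play:", round(sd, 4))
print("outs per season:", round(season_outs, 1))
```

Going from 11 outs to 9 runs implies a value of roughly 0.8 runs per out; that conversion is taken from the comment as given and is not derived here.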
These findings are consistent with my analysis of UZR and some other "logical" constructions.
"Finley's defense is something of a question mark. He's definitely not GG-caliber anymore, but he put up a decent fielding season last year."
I WOULD POINT YOU TO:
http://www.baseballmusings.com/archives/005830.php
AND IIRC, HIS UZR WAS RUMORED TO BE SOMEWHERE BETWEEN MANNY RAMIREZ AND, SAY, PHIL DONAHUE WITH A MIZUNO MITT ON THE WRONG HAND....
David,
From what I remember about your parameters, you have one parameter for "park". Is that correct? I would suggest that a better parameter would be "park-grid", or to get fewer parameters "park-fieldarea". Fenway LF and Fenway RF would have different biases. I think you treat them as independent, when they shouldn't be.
This will get you into a problem, though, with Andruw Jones. You shouldn't have a model of how Atlanta affects CF by looking at Jones, and then applying that factor to him. This however is not a knock at your system, but essentially every system. Imagine applying a LH HR park factor to Yankee Stadium in the 20s, with one guy making up over half the sample.
" I would suggest that a better parameter would be "park-grid", or to get fewer parameters "park-fieldarea". Fenway LF and Fenway RF would have different biases. I think you treat them as independent, when they shouldn't be."
I realize that I'm fairly ignorant, but why wouldn't you want to treat Fenway LF and RF as independent? Wouldn't considering environmental factors be beneficial?
"You shouldn't have a model of how Atlanta affects CF by looking at Jones, and then applying that factor to him."
With regard to measuring Andruw against Andruw, wouldn't you simply ignore over-representation (i.e. the home team's starting fielder) when figuring a park factor? Ideally, wouldn't you want to take data from the greatest number of players (the 20(?) CFs who visited the Ted in any given year) who played the most similar number of innings at any given position, then use that as your sample size? Would you also want to drop over-representation from within the division, I suppose, to make up for the presence of a Cameron whose performance (if as superior as supposed) would "break the curve"?
I wrote poorly. When I said this:
I think you treat them as independent
I should have said:
David, I believe, you treat the park parameter and the grid parameter as independent variables. We should be treating the parameter as park-grid. That is, Fenway-LF would be a new third parameter, as opposed to only having two parameters: Fenway being one parameter, and LF being another parameter, each independent. I'm suggesting having three parameters:
park
grid
park-grid
***
As for over-representation, I'm not too concerned. I'd simply take out the player in question from the entire sample. So, Andruw Jones' backup might have a biased view. That's fine. You aren't comparing the backup to Jones, but rather you are comparing the backup to how you think his context was biased, and of which Jones supplies a great deal of the data. I can live with that.
I think you'll find that keeping or taking out Jones from the sample will have almost no effect on the backup's metric. However, keeping or taking out Jones would have some (large?) effect on Jones' metric.
That's why I recommend applying adjustment factors by taking out the player in question that you are trying to adjust.
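A minimal Python sketch of that leave-one-out idea, assuming each play is recorded with its park, field area, fielder, and whether it became an out. The record layout, the function name, and the toy plays are all invented for the example; nothing here reflects how the actual system stores its data.

```python
def park_area_out_rate(plays, park, area, exclude_fielder=None):
    """Out rate for one park and field area, optionally leaving one fielder out."""
    outs = 0
    balls = 0
    for play in plays:
        if play["park"] != park or play["area"] != area:
            continue
        if exclude_fielder is not None and play["fielder"] == exclude_fielder:
            continue  # drop the player whose own rating we are about to adjust
        balls += 1
        outs += int(play["out"])
    return outs / balls if balls else None

# Toy plays, invented for illustration.
plays = [
    {"park": "park_A", "area": "CF", "fielder": "regular", "out": True},
    {"park": "park_A", "area": "CF", "fielder": "regular", "out": True},
    {"park": "park_A", "area": "CF", "fielder": "regular", "out": True},
    {"park": "park_A", "area": "CF", "fielder": "visitor", "out": True},
    {"park": "park_A", "area": "CF", "fielder": "visitor", "out": False},
    {"park": "park_B", "area": "CF", "fielder": "regular", "out": True},
]

with_regular = park_area_out_rate(plays, "park_A", "CF")
without_regular = park_area_out_rate(plays, "park_A", "CF", exclude_fielder="regular")
print("all fielders:", with_regular)
print("regular left out:", without_regular)
```

The two rates differ because the home regular supplies most of the sample, which is the situation described for Jones in Atlanta.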
Good stuff. With all the Finley bashing going on here, has anybody bothered to note his Rate2 increased noticeably at Dodger Stadium? Now, maybe that's because he's got two centerfielders on either side of him -- that'll take a lot of weight off one guy's shoulders -- but I'd like to see if the 2003 numbers alluded to earlier also applied for the second half of 2004.
TOLAXOR:
I said "decent." Not "BEST EVAR." =P And I was referring to last season, not 2003. I should've been more specific, noting his up-down bumpy trend, but I figured most people would be familiar enough with the numbers to know what I was talking about.
Like I said, his defense is a big question mark. The Angels probably would've been better off putting Erstad back in center and finding someone to play 1B/DH. Heck, they could've put Garrett "I Like to Swing My Bat at the Ball" Anderson at DH/1B and signed Finley to play corner (or vice versa). The defense they've been putting up in the past few seasons has really been hurting that team.
Laughed at the Donahue line, though. =)
David,
Once again, excellent work!
MGL
I'm surprised the Cubs come out so good defensively... and the Angels so bad.
I watched the D'backs most of the year (until it became too painful) and I can tell you this was a terrible defensive team. Yet they show up right in the middle in your ranking. Their DER is right near the bottom at .680, as it should be. Only the Rockies were below them (.679), and that is obviously due to Coors Field. (Can you post home and road stats?) But the D'backs' Predicted DER is dead last at .684, 3 points lower than the Rockies'. That makes no sense.
I think this is an excellent method of rating defense. I applaud the idea. I think it's a major step up from fielding pct/range/zone rating. But I think there's an inconsistency here. In any exponential model, the maximum-likelihood parameters are found when expected data equals empirical data, i.e., actual outs should equal predicted outs. Considering that outs for all teams are very close to 3000, DER should be nearly equal to expected DER. This clearly isn't the case. What model are you using? Is there a LaTeX write-up available somewhere?
Oversight on my part: David noted that he trained the model on 02-04 data and evaluated the model on 04 data. This is the reason for the difference between expected and actual outs.
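The point can be seen with a toy example. The Python sketch below assumes the simplest possible model, a table of out rates per bucket, fitted on all years at once; the post does not say what model form is actually used. On the full training sample expected outs match actual outs exactly, but on a single year taken from that sample they need not.

```python
from collections import defaultdict

def fit_rates(plays):
    """Fit an out rate per bucket from (year, bucket, was_out) records."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [outs, balls in play]
    for _, bucket, was_out in plays:
        counts[bucket][0] += int(was_out)
        counts[bucket][1] += 1
    return {bucket: outs / balls for bucket, (outs, balls) in counts.items()}

def actual_and_expected_outs(plays, rates):
    actual = sum(int(was_out) for _, _, was_out in plays)
    expected = sum(rates[bucket] for _, bucket, _ in plays)
    return actual, expected

# Toy records, invented for illustration.
plays = [
    (2002, "grounder", True), (2002, "grounder", True),
    (2003, "fly", True), (2003, "fly", False),
    (2004, "grounder", True), (2004, "grounder", False),
]
rates = fit_rates(plays)

all_actual, all_expected = actual_and_expected_outs(plays, rates)
only_2004 = [play for play in plays if play[0] == 2004]
y_actual, y_expected = actual_and_expected_outs(only_2004, rates)

print("all years:", all_actual, all_expected)
print("2004 only:", y_actual, y_expected)
```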